{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# The Python Programming Environment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Guido van Rossum and Versions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### [Guido](https://gvanrossum.github.io/) is the creator of [Python](https://www.python.org/) ![avatar](http://bazhou.blob.core.windows.net/learning/mpp/guido.jpg)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "In the interview below, even [Peter Norvig](https://en.wikipedia.org/wiki/Peter_Norvig) mispronounces his name; Guido's home page has the Dutch pronunciation.\n",
    "<br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "<video width=\"80%\" controls src=\"http://bazhou.blob.core.windows.net/learning/mpp/142_Guido_Van_Rossum.mp4\" />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    ">He is very active on [Stack Overflow](https://stackoverflow.com/users/818274/guido-van-rossum) and [GitHub](https://github.com/gvanrossum), and he updates his [Twitter](https://twitter.com/gvanrossum) frequently.\n",
    "\n",
    "![Guido_twitter](http://bazhou.blob.core.windows.net/learning/mpp/guido_tweet_996.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
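   },
   "source": [
    "- Leaving aside Python's [past, present, and future](http://bazhou.blob.core.windows.net/learning/mpp/python-past-present-and-future-with-guido-van-rossum.mp3), the first practical question is the version.\n",
    "- Python has two major versions: 2 and 3.\n",
    "- We use version 3, or more precisely, 3.6.5.\n",
    "- Version mismatches can cause a lot of trouble; this course handles them with Docker."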
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### We use Python [3.6](https://docs.python.org/3/whatsnew/3.6.html) ![py3ver](http://bazhou.blob.core.windows.net/learning/mpp/py3ver.gif)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Shell, Script, REPL"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- Shell: where commands are entered"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- Script: commands saved in a file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- REPL: Read-Evaluate-Print Loop"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Challenge: build the vocabulary list for this course"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- Subtitle files for 17 videos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "![shell_challenge](http://bazhou.blob.core.windows.net/learning/mpp/shell_challenge.gif)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "- Input data [example](http://bazhou.blob.core.windows.net/learning/mpp/msxpy/16_253_6.2-wCnbczfN91s.txt)\n",
    "\n",
    "4\n",
    "\n",
    "00:00:14,010 --> 00:00:19,009\n",
    "\n",
    "In practice, you'll be working with data of\n",
    "different types: numerical values, strings,\n",
    "\n",
    "5\n",
    "\n",
    "00:00:19,009 --> 00:00:21,279\n",
    "\n",
    "booleans and so on.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "- Connect the scripts with pipes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "```bash\n",
    "cat *.txt|./clean.sh # clean the subtitle files and split them into words\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "```bash\n",
    "cat *.txt|./clean.sh|sort|uniq # sort, then drop duplicates\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "```bash\n",
    "cat *.txt|./clean.sh|sort|uniq|wc # count the unique words\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "- Subtitle-cleaning script [clean.sh](http://bazhou.blob.core.windows.net/learning/mpp/msxpy/clean.sh)\n",
    "\n",
    "```bash\n",
    "#!/bin/sh\n",
    "tr '[:blank:]' '\\n'|tr '[:upper:]' '[:lower:]'|tr -d '\\r'|grep -vE \"'\"|grep -vE \"\\.\"|tr -d '[:punct:]'|grep -vE \"^[^a-zA-Z].*\"|grep -vE \".*[0-9].*\"\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Jupyter\n",
    "\n",
    "[Origin of the name](https://news.ycombinator.com/item?id=16978364)\n",
    "![jupiter](http://bazhou.blob.core.windows.net/learning/mpp/jupiter.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## Jupyter\n",
    "\n",
    ">Like a shell, Jupyter lets you edit in a cell and run in a cell, over and over\n",
    "\n",
    "- R: edit the cell (Enter)\n",
    "- E: run the cell (Shift+Enter)\n",
    "- P: print the cell's output\n",
    "- L: move to the next cell\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "hello world\n"
     ]
    }
   ],
   "source": [
    "print(\"hello world\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  },
  "livereveal": {
   "scroll": true
  },
  "rise": {
   "enable_chalkboard": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
